Finding trends and statistical patterns in name mentions in news
Abstract
We extract the individual names of persons mentioned in news reports from a Philippine-based daily in the English language from 2010 to 2012. Names are extracted using a learning algorithm that filters adjacent capitalized words and runs them through a database of non-names grown through training. The number of mentions of individual names shows strong temporal fluctuations, indicative of the nature of “hot” trends and issues in society. Despite these strong variations, however, we observe stable rank-frequency distributions across different years in the form of power laws with scaling exponents α = 0.7, reminiscent of the Zipf’s law observed in lexical (i.e. non-name) words. Additionally, we observe that the adjusted frequency for each rank, or the frequency divided by the number of unique names having the same rank, shows a distribution with dual scaling behavior, with the higher-ranked names preserving the α exponent and the lower-ranked ones showing a power-law exponent α′ = 2.9. We reproduced the results using a model wherein the names are taken from a Barabasi-Albert network representing the social structure of the system. These results suggest that names, which represent individuals in the society, are archived differently from regular words.
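The extraction and rank-frequency steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `NON_NAMES` set is a hypothetical stand-in for the trained non-name database, the function names are invented for this example, and the exponent is estimated with a plain least-squares fit on log-log axes, which is only one of several possible fitting methods.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the trained database of non-names.
NON_NAMES = {"The", "Senate", "Monday", "Philippine"}

def extract_names(text, non_names=NON_NAMES):
    """Collect runs of two or more adjacent capitalized words,
    after cutting each run at any word found in the non-name set."""
    names = []
    # A run is one or more capitalized words separated by single spaces.
    for run in re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", text):
        piece = []
        for word in run.split() + [None]:  # None flushes the final piece
            if word is not None and word not in non_names:
                piece.append(word)
            else:
                if len(piece) >= 2:
                    names.append(" ".join(piece))
                piece = []
    return names

def rank_frequency(names):
    """Mention counts in descending order; index i holds rank i + 1."""
    return sorted(Counter(names).values(), reverse=True)

def zipf_exponent(freqs):
    """Estimate alpha in f ~ rank**(-alpha) from the least-squares
    slope of log(frequency) against log(rank). Needs two or more ranks."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx
```

In this sketch a candidate name must contain at least two capitalized words, so single-word mentions are ignored; whether the original algorithm makes the same choice is not stated in the abstract.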
Similar resources
A Study of Readers' Demographic and Behavioral Patterns for the Selective Dissemination of News
Purpose: The current research focuses on selective dissemination of news and aims at finding patterns for recognizing readers’ favorite news through web mining techniques. Method: Data for this research were collected from the Yahoo News website. The source of the news was the Associated Press. 840 news items dated between 2011/3/1 and 2011/5/10 were analyzed through a subject clustering technique. Findings:...
Finding Potential News from Trends Originating in the Blogosphere
Tracking current population interests by trends in online media of entities and topics has become increasingly popular. But while notable world events often spur online public discussion, some have been observed originating in social media postings. A natural question arises: Can analysis of social media trends be used to find mainstream newsworthy material? The work reported here takes initial...
Thematic Progression Patterns in the English News and the Persian Translation
Thematic progression pattern, as the method of development of the text, ensures that the reader follows the right path in understanding the text; in this regard, this subject is attracting considerable interest among discourse analysts. This paper calls into question the status of thematic progression in the process of translating English news into Persian. With this in mind, we analyzed the them...
A Probabilistic Model for Canonicalizing Named Entity Mentions
We present a statistical model for canonicalizing named entity mentions into a table whose rows represent entities and whose columns are attributes (or parts of attributes). The model is novel in that it incorporates entity context, surface features, first-order dependencies among attribute-parts, and a notion of noise. Transductive learning from a few seeds and a collection of mention tokens co...
Frame Labeling of Competing Narratives in Journalistic Translation
Studying translations during the time of conflict has gained currency in the recent decade in translation studies. One of the cases in which conflict manifests itself is in the way different countries choose to name an event or a geographical location, for example. This study set out to understand how translation of rival names and labeling was carried out in Iranian state-run news agencies. To...
Journal title:
- CoRR
Volume abs/1507.02449
Publication date 2015